Keyword [Shufflenet]

Zhang X, Zhou X, Lin M, et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 6848-6856.

1. Overview

1.1. Motivation

the limitation of computing power on mobiledevices
costy dense 1x1 convolutions

In this paper, it proposed ShuffleNet

pointwise group convolution
Depthwise convolution
channel shuffle.

(a) output from a certain channel are only derived from a small fraction of input channel. block information flow between channel group and weakens representation

1.2. Channel Shuffle

for a gn (group, number of each group) feature map

reshape to gxn
transpose nxg
reshape to ng

1.3. ShuffleNet Unit

1.4. Architecture

1.5. Unit Comparison

for a point of c channels feature map, m channels of bottleneck

ResNet. 2cm + 9mm
ResNeXt. 2cm + 9mm/g
ShuffleNet. 2cm/g + 9m
ShuffleNet apply group convolution to two 1x1 pointwise convolution.

1.6.1. Model

GoogleNet
SqueezeNet
SENet
NASNet

1.6.2. Group Convolution

AlexNet. 50% kernel on first GPU, 50% on second GPU
ResNeXt
Xception. depthwise
MobileNet. depthwise

1.6.3. Channel Shuffle Operation

cuda-convnet. random sparse convolution layer, equivalent to random channel shuffle + group Conv

1.6.4. Model Acceleration

Pruning connection
Channel reduction
Quantization
Factorization
Implement convolution by FFT
Distillation
PVANET

2. Experiments

2.1. Hyperparameter

2.2. Shuffle Channel

sx. means the scale of channel, sxs times complexity of 1x.

2.3. Comparison

18 times faster than AlexNet

2.4. Inference Time on Mobile Devices

Empirically g=3 has a proper trade-off between accuracy and actual inference time